SYSTEM OF ARTIFICIAL INTELLIGENCE FOR THE
CLASSIFICATION OF EVENTS, OBJECTS OR SITUATIONS FROM
SIGNALS AND DISCRIMINANT PARAMETERS PRODUCED BY MODELS



Technical field 

The present invention relates to a system of 
artificial intelligence for the classification of 
events, objects or situations from signals and 
discriminant parameters produced by models. 

In particular, the invention applies to the
classification of seismic events. Such a classification
will be considered as a non-limiting example in the
following description.

State of Prior Art 

Automatic classification of seismic events 

Automatic classification of seismic events is a
relatively recent problem, since it was only really
approached in the 1980s. These works were generally
oriented towards the search for discriminant parameters
(that is to say, parameters making classification
possible) in seismic signals. Many potential
characteristics were proposed with a view to future
automatic classification. After 1990, attempts to carry
out automatic classification began to appear in
published articles, either using neural techniques or
rule-based systems. These works sought to separate
earthquakes of natural origin from explosions. None of
these articles deals with the discrimination of rock
failures in the mining industry ("rock bursts").

Because of the complexity of the problem, these
articles clearly demonstrate the need to perfect
automatic systems capable of learning. Neural methods
are therefore often proposed to carry out automatic
discrimination of seismic events, but with limitations
analysed below. The models suggested most often are
multilayer perceptrons with complete connections
between successive layers.

All these articles seek to determine the origin of 
the earthquake starting from characteristics extracted 
solely from the signals. The highest level data (date, 
time, latitude, longitude, magnitude etc.) are never 
used for classification. But seismologists know the 
difficulty of discriminating seismic signals through 
low level processing alone. 

Works by Baumgardt and his colleagues described in 
document reference [1] are without doubt those which 
have contributed to making the greatest advances in the 
search for discriminant parameters. 

The variations of the cepstrum, the cepstrum of a 
signal x being the inverse Fourier transform of the 
logarithm of the Fourier transform of x, are often 
used. It can thus be shown that the cepstrum makes it 
possible to visualise the phenomenon of micro-delays 
present in the blast signals characterised by a greater 
variance. The document reference [2] also notes this 
property, nonetheless pointing out that the absence of 
this characteristic does not permit any deduction 
concerning the class of the event. 
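
The cepstrum definition given above can be sketched in a few lines. This is a minimal illustration, assuming the real cepstrum (logarithm of the spectrum magnitude) and a synthetic signal made of a noise burst plus one delayed copy; the names `real_cepstrum`, `delay` and `gain` and all numeric values are hypothetical and do not come from the cited documents.

```python
import numpy as np

def real_cepstrum(x, eps=1e-12):
    """Inverse FFT of the log-magnitude spectrum of x (the real cepstrum)."""
    spectrum = np.fft.fft(x)
    # eps guards against log(0); it is an implementation detail of this sketch.
    return np.fft.ifft(np.log(np.abs(spectrum) + eps)).real

# Synthetic "delayed blast": a noise burst plus one delayed, attenuated copy.
rng = np.random.default_rng(0)
n_samples, delay, gain = 1024, 40, 0.8      # arbitrary illustration values
source = rng.standard_normal(n_samples)
blast = source + gain * np.roll(source, delay)   # circular delay keeps the sketch simple

cepstrum = real_cepstrum(blast)
# Search only positive quefrencies up to half the record length.
peak_quefrency = 1 + int(np.argmax(cepstrum[1:n_samples // 2]))
print("cepstral peak at sample", peak_quefrency, "for an imposed delay of", delay)
```

The delayed copy shows up as an isolated cepstral peak at the delay, which is the kind of micro-delay signature described above. As noted in document reference [2], the absence of such a peak would not allow any conclusion about the class of the event.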

The ratios of the amplitudes of the different
types of waves can also serve as discriminants. The
document reference [3] studies a whole series of
amplitude ratios (Pn/Lg, Pg/Lg, Lg/Rg). These ratios are
described as being able to provide good discrimination.

B13340.3DB

The same authors also introduce ratios of spectral 
densities of power of the different types of waves 
detected. Just like the amplitude ratios, these 
discriminants are used by all the studies seeking 
discriminants in seismic signals. In order to 
characterise the explosions, ratios have also been used 
for power spectral densities of one wave type, S here, 
for bands of different frequencies, that is to say the 
ratio of the power spectral density of S in the range 
1-2 Hz to the power spectral density of this same phase 
in the 7-20 Hz band. The ratio between the power 
spectral densities of the S wave below and above 10 Hz 
is also given as a good separator between explosions 
and earthquakes. 
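
A band ratio of this kind can be sketched as follows. This is only an illustration, assuming a simple periodogram as the power spectral density estimate, a 50 Hz sampling rate and a window already cut around the S phase; the function names and the synthetic test signals are hypothetical.

```python
import numpy as np

def band_power(signal, fs, f_lo, f_hi):
    """Sum of the periodogram of `signal` over the band [f_lo, f_hi] in Hz."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / (fs * len(signal))
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    return psd[in_band].sum()

def spectral_ratio(signal, fs, low_band=(1.0, 2.0), high_band=(7.0, 20.0)):
    """Ratio of low-band to high-band power for one phase window (e.g. S)."""
    return band_power(signal, fs, *low_band) / band_power(signal, fs, *high_band)

fs = 50.0                                   # assumed sampling rate in Hz
t = np.arange(0, 20.0, 1.0 / fs)            # 20 s window, assumed to hold the S phase
low_rich = np.sin(2 * np.pi * 1.5 * t) + 0.1 * np.sin(2 * np.pi * 10.0 * t)
high_rich = 0.1 * np.sin(2 * np.pi * 1.5 * t) + np.sin(2 * np.pi * 10.0 * t)

ratio_low_rich = spectral_ratio(low_rich, fs)
ratio_high_rich = spectral_ratio(high_rich, fs)
print(ratio_low_rich, ratio_high_rich)
```

The two synthetic windows give ratios on opposite sides of 1, which is the property a classifier would exploit. Which side corresponds to which class of event is not assumed here.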

The document reference [4] notes that, for a given
recording station, signals coming from a given mine show
a constant time difference tSg-tPg between the arrivals
of the Sg and Pg phases. This propagation time is
presented as a potential characteristic of a mine,
although it remains less reliable than the preceding
characteristics.

The document reference [5] suggests using the
presence of the surface wave from earthquakes to
discriminate them from nuclear explosions at regional
distances. Characterisation of the presence of a
surface wave is made indirectly by comparing the
magnitudes mb and Ms. For two seismic events of the same
magnitude mb, the surface-wave magnitude Ms is
generally higher in the case of an earthquake than in
the case of an explosion, because of the presence of the
surface wave. In fact, this crustal Rayleigh wave
enters into the calculation of the magnitude Ms and its
presence is subordinate to the phenomenon of shearing,
absent in the case of nuclear explosions.
Representation of the difference (mb-Ms) as a function
of mb makes it possible to verify this hypothesis.
Nevertheless, the calculation of the magnitude Ms
depends on the periodicity of the signal recorded and
it is not possible to be rigorous for regional events.
Conversely, the presence of a surface wave
corresponding to a sedimentary Rayleigh wave in close
seismic signals characterises events of an artificial
nature. A detection method for this second type of
surface wave consists of searching for it directly in
the spectrogram of the signal: its frequency is known
(between 0.5 and 1.5 Hz) and its expected time of
arrival can be calculated from its average speed of
propagation and the distance separating the epicentre
from the recording station.

The systems described in documents of prior art 
are not operationally credible for several reasons: 

• studies carried out by geophysicists, usually
rich and detailed concerning proposals for discriminant
parameters, do not suggest any reliable method for
automatic exploitation of these parameters.

• studies carried out by computer specialists 
propose systems that do not take sufficient account of 
the complementarity of data and geophysical knowledge. 

Most prior art studies use data bases of
seismic events of extremely reduced size, with the
consequence that they do not permit correct statistical
learning. Classification is usually carried out on
bases with fewer than one hundred events, as described
in the documents reference [2] and [6]. One of the
biggest bases found in prior art documents comprises
only 312 events, as described in document [4]. The
direct consequence is that the margins of error for the
results presented are very high, which cannot provide
great confidence in these results.
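
The order of magnitude of these margins of error can be estimated with a short calculation. This is a sketch only, assuming the usual normal approximation for a proportion and a hypothetical classification rate of 90 %; neither the rate nor the function name `accuracy_margin` comes from the cited documents.

```python
import math

def accuracy_margin(accuracy, n_events, z=1.96):
    """Half-width of the normal-approximation 95% interval for a proportion."""
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n_events)

assumed_accuracy = 0.90          # hypothetical classification rate, illustration only
for n_events in (100, 312, 10_000):
    margin = accuracy_margin(assumed_accuracy, n_events)
    print(f"{n_events:>6} events: {assumed_accuracy:.0%} +/- {margin:.1%}")
```

Under these assumptions the uncertainty is close to six points for a base of 100 events and still above three points for 312 events, so that two published classification rates differing by a few points cannot be ranked reliably.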

The geographic spread of the examples in the data
base is a very important element. Most bases group
together events which have taken place in regions of
restricted size (several tens of kilometres per side),
where the geological properties under ground have
little diversity. The search for general discriminants
is therefore biased, the discriminants only being
effective for a given region.

Moreover, as described in the document reference
[1], the events of the two classes to be discriminated
can come from two clearly distinct geographic regions,
sometimes as far apart as several hundreds of
kilometres. It is therefore impossible to know to what
extent the "colouring" of the signals by the geological
layers travelled through influences the discrimination,
rather than the signals themselves. But seismologists
know that this "colouring" is far from negligible and
that localisation information is very important.

Generally, a very limited number of recording
stations is used. The signals are recorded by two or
three stations at maximum, but usually by only one
single station. The seismic event is thus represented
by a single signal, which reduces usable information
considerably.

Finally, the events integrated into the data base
are very generally selected according to previously
defined criteria: magnitude greater than a certain
threshold, signal/noise ratio greater than a certain
threshold, as described in document reference [2]. But
this selection evidently biases the results completely.

Although prior art studies have made it possible
to register a wide range of potentially useful
discriminants for classification, it has proved to be
very difficult to find efficient global discriminants
because of the high number of distinct types of
blasting and earthquakes.

In most cases, the classifiers proposed turn out
to be inefficient because they are too simple (linear
separators) or impossible to regulate because of their
complexity. They demand enormous work on pre-processing
the data, which means that the systems proposed cannot
be generalised.

Example of seismic surveillance 

The detection and geophysics laboratory (LDG) of
the CEA has continuously monitored the seismic activity
of the earth since 1962. When a seismic event occurs in
any spot on the globe, it is recorded in France by a
network of forty-two vertical seismometers located on
mainland territory, as shown in Figure 1, the SP
stations being short-period stations and the LP
stations being long-period stations. A detailed
description of the network of seismometers and the
propagation of seismic waves in France is given in the
document reference [7].

This network, which had used transmission by
Hertzian channels since its creation, has recently moved
to digital transmission by satellite. Filtering and a
gain adapted to the signal make it possible to detect
close earthquakes or, on the other hand, longer period
distant earthquakes called teleseisms. The adjustment
of the filtering parameters and of the gain must make it
possible to find a compromise between the detection of
seismic events of relatively low magnitude and the
rejection of background noise.

Figure 2 shows signals recorded by seismometers of
the LDG laboratory located between 84 and 146
kilometres (that is SBF: 84 km; PGF: 110 km; FRF:
127 km; LMR: 136 km and LRG: 148 km) from the estimated
epicentre of an earthquake of magnitude 1.9 located 10
kilometres south of Imperia in Italy on May 9, 1996
(time: 1 hr 0 min 59 sec; latitude: 43.34; longitude:
8.19; magnitude: 1.9).

Different seismic phases are indexed on each
signal, which will be used for the detailed analysis of
the event.

Each year, about 9,000 close seismic events are
thus detected, of which 800 to 1,200 are natural
earthquakes.

The seismologists of the LDG laboratory process
and analyse daily the data recorded by the stations of
the French network. They publish a weekly bulletin
containing all the natural earthquakes in
France or in the bordering regions. A similar bulletin
is published for teleseisms. Table I, given at the end
of the description, is an extract from the bulletin for
the period from September 9 to 15, 1998. In this table,
there are the following abbreviations:

TIME OR.: time of origin (UT);

LAT: latitude of the epicentre (deg.);

LON: longitude of the epicentre (deg.);

DEP: depth of the epicentre (km);

LM: local magnitude;

MSE: mean square error (s).

This table thus comprises the ensemble of close 
earthquakes detected during this week and the 
characteristics of each event: date and time of origin, 
position and depth of the epicentre, magnitude, mean 
square error for the localisation and deduced 
localisation region. 

Because of the great imbalance between the number
of artificial events and that of natural events (in
France, seismic events of artificial origin are ten
times more frequent than natural seismic events), only
events presumed to be earthquakes or events of
non-determined class are extracted from the background
noise by seismologists in order to be analysed more
precisely later by localisation software. The other
signals (mainly artificial events) are archived for six
months.

The procedure for analysing the signals comprises
an already automated localisation, followed by a
characterisation phase (determination of the event at
the origin of the signals).

The analysis of the signals is carried out using a
global system illustrated in Figure 3.

The aim of the invention is to overcome the
drawbacks of prior art systems by proposing a new
system of artificial intelligence for classifying
events, objects or situations from signals and
discriminant parameters produced by models.


Description of the invention 

The present invention relates to a system of
artificial intelligence for the classification of
events, objects or situations from signals and
discriminant parameters produced by models,
characterised in that it comprises at least one
processing branch comprising a fuzzy expert system (FES
expert) taking a decision according to high level
properties and lower level discriminant parameters
extracted from signals by signal processing type
procedures, and capable of explaining its decision to
the user through the intermediary of rules selected by
order of applicability.

In this fuzzy expert system a gradient descent is
carried out on the parameters:

• $x = y/\sigma$

• $s = \ln(2\sigma^2)$

• $r = \ln(p)$

• $d$

with:

• $y$: position of fuzzy sets of premises

• $\sigma$: width of fuzzy sets of premises

• $p$: weight of rules

• $d$: degree of activation of each class for each
rule.

Advantageously the system according to the
invention is a multi-expert system constituted of at
least two independent processing branches, organising
themselves automatically through statistical learning
on the data bases, having particular properties, and
merged by a high level decisional system.
Advantageously one branch comprises a neuro-fuzzy
classifier (NFC expert) taking its decisions from high
level properties and lower level discriminant
parameters extracted from signals by signal processing
type procedures. Advantageously another branch
comprises a neural network with local connections and
shared weights (TDNN) constituted of banks of
non-linear adaptable filters, itself extracting
discriminant information from time-frequency
representations of the signals corresponding to the
event.

The invention can be used in different fields of 
application and particularly in: 
• Surveillance of geophysical events 

The system then applies to the analysis of any 
geophysical event observable through signals received 
by the stations: 

- seismic signals; 

- infrasound; 

- hydro-acoustic waves. 

These events can be close events (called regional)
or distant events (for example, teleseisms).

The function to be ensured may be:

- filtering to eliminate the non-relevant events
for later processing;

- detection of special events;

- exhaustive classification into a set of groups
of events of the same nature.

• Industrial surveillance and monitoring

The system also applies to the analysis of objects
or industrial processes, as long as one has available
signals or images collected by sensors. A few examples
are given below:

• Quality control of manufactured objects or
products: the aim is to verify the shape and/or
position of objects, to detect and characterise
defects. The NFC and FES experts use measurements
produced by image processing. The TDNN expert uses one
or several images of the part.

• Predictive maintenance of equipment: the aim is
to foresee a future failure of machines, computers,
electronic equipment or sensors in order to give a
warning and make it possible to implement a correction
procedure before breakdown. The NFC and FES experts use
measurements of high level coefficients, and
correlations. The TDNN expert uses signals.

• Complex process surveillance: the aim is to
verify that a production chain is operating correctly.
The NFC and FES experts use measurements of high level
coefficients. The TDNN expert uses measurements of low
level coefficients.

In the geophysical domain, the high level
properties can be the location, the magnitude, the time
and the date. The system according to the invention
makes it possible to carry out automatic classification
of seismic events into three sets:

- natural earthquakes;

- explosions (blasting in mines and military
trials);

- rock bursts (collapse of layers in mines).

The system then integrates into a chain of
automatic processing to operate a filtering function of
seismic events. Its principal characteristics are:

- maximum reliability: the system is able to take
decisions even with corrupted or imprecise data, or
even in the absence of certain information;

- access to the explanation of decisions in order
to avoid any eventual doubt about a decision.



Brief description of the drawings 

Figure 1 shows the network of seismometers
belonging to the laboratory for detection and
geophysics (LDG) of the centre for atomic energy (CEA)
in 1998.

Figure 2 shows examples of seismic signals
recorded by the network of figure 1.

Figure 3 shows a prior art global system for
exploitation of geophysical systems.

Figure 4 shows the diagram of the principle of the
multi-expert system for discrimination of seismic
events according to the invention.

Figure 5 shows the general learning diagram of the
system according to the invention, from examples.

Figure 6 shows an artificial neuron.

Figure 7 shows a network of artificial neurons.

Figure 8 shows a neuro-fuzzy classifier.

Figure 9 shows a mechanism for activation of
coding cells.

Figure 10 shows examples of activation of coding
cells.

Figure 11 shows a fuzzy expert system for the
discrimination of seismic events.

Figures 12A to 12D show successive pre-processing
applied to a seismic signal.

Figure 13 shows the architecture of a network of
neurons with local connections and shared weights.

Figure 14 shows the spread of epicentres of
seismic events between 1962 and 1996.

Detailed description of embodiments

General description of the system

The system according to the invention comprises at
least one processing branch containing a fuzzy expert
system. When it comprises several branches of
independent processing, one refers to a multi-expert
system.

• Multi-expert decision making

The principle of multi-expert decision-making,
which is thus one of the embodiments of the invention,
is the exploitation of the synergy between several
complementary processing branches. This complementarity
resides in:

A. The general performances: one branch is rather
generalist (fairly good performances in most
cases), another is rather specialised (very
good performance for certain difficult cases, a
higher level of errors in cases outside its
competence).

B. The performances according to the case treated:
one branch can be more able than another for
treating a particular case.

C. The nature of inputs (signals or high level
data).

D. The nature of outputs (single data for the
class, estimation of the certitude of the
decision, formal explanation of the decision).

Figure 4 shows the diagram of the principle of the
multi-expert system for discrimination of seismic
events according to the invention. This system is
constituted of several independent processing branches,
each with special properties, merged by a high level
decisional system.

These branches are:

- a neuro-fuzzy classifier, NFC, making its
decisions from high level properties of events (for
example for seismic events: localisation, magnitude,
time, day of the week) and lower level parameters
extracted from the signals by procedures of the signal
processing type;

- a fuzzy expert system, FES, taking a decision in
an independent way from the same information, and able
to explain its decision to the user through the
intermediary of rules selected by order of
applicability to the event being processed;

- a neural network with local connections and
shared weights, TDNN, constituted of banks of
non-linear adaptable filters, itself extracting the
relevant information from time-frequency representations
of the signals corresponding to the event.

These three branches configure themselves
automatically by statistical learning on the data bases
of seismic events.
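
The text does not specify how the high level decisional system merges the outputs of the three branches. Purely as an illustration, the sketch below assumes that each expert returns one score per class and that the merge is a weighted average that skips experts unable to answer; the function `merge_experts`, the `trust` weights and the score values are all hypothetical.

```python
import numpy as np

CLASSES = ("earthquake", "explosion", "rock burst")

def merge_experts(scores_by_expert, trust=None):
    """Weighted average of per-class scores; experts returning None are skipped."""
    available = {name: np.asarray(scores, dtype=float)
                 for name, scores in scores_by_expert.items() if scores is not None}
    if not available:
        raise ValueError("no expert produced an output")
    trust = trust or {}
    weights = np.array([trust.get(name, 1.0) for name in available])
    stacked = np.stack(list(available.values()))
    merged = weights @ stacked / weights.sum()
    return CLASSES[int(np.argmax(merged))], merged

# Hypothetical outputs: the TDNN branch has no usable signal for this event.
decision, merged = merge_experts(
    {"NFC": [0.2, 0.7, 0.1], "FES": [0.1, 0.8, 0.1], "TDNN": None}
)
print(decision, merged)
```

Skipping a silent branch is one simple way of keeping a decision available when part of the information is missing; other merging rules are equally compatible with the description.
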

• Learning from examples

Learning from examples consists of building a
model of the decision-making system by progressive
adjustment of parameters based on data. This model must
be able to associate the right decision (output) to a
set of data describing the case being processed
(inputs). This is carried out progressively, by
iterative presentation of the cases available in the
example base fed to the system. Such a procedure is
shown in the flow chart of figure 5.

According to the invention, the learning model can
either be a network of artificial neurons or a fuzzy
expert system.

Once the system has ended its learning phase, its
internal parameters are set and the system is ready for
use.

• Neural network of the "multilayer Perceptron" type

Such a network of artificial neurons of the
"multilayer Perceptron" type is a special model of a
neural network able to be used as a decision-making
system. It is constituted of a network of simple
computing units, the "artificial neurons".

A neuron $N_j$, as shown in figure 6, is an entity
constituted of a weight vector $W_j = \{w_{ij}\}$ and a
non-linear transfer function $\phi$. It accepts an input
vector $X = \{x_i\}$ and carries out a transformation of
these inputs of the type

$$y_j = \phi\left(\sum_i w_{ij} x_i\right).$$

By analogy with the vocabulary used in neurophysiology,
one says that each input $x_i$ is linked to the neuron
$N_j$ by a synaptic connection. A synaptic weight $w_{ij}$
modulates the efficiency of this connection.
In a network of artificial neurons, as shown in
figure 7, the neurons are assembled in successive
layers. A layer is defined as a set of neurons not
having connections between each other, but able to have
connections with neurons of the preceding layers
(inputs) or following layers (outputs). In general,
only neurons of successive layers are connected.

Learning consists of progressively modifying the
weight values $w_{ij}$ until the outputs of the network,
which is constituted of a certain number of neuron
layers, correspond to the required outputs.

In order to achieve this, one defines a
classification error that one wishes to minimise. The
most commonly used error is the mean square error,
defined by

$$E = \sum_{k=1}^{N_{\mathrm{outputs}}} \left(z_k - z_k^{\mathrm{required}}\right)^2.$$

The method consists of making a gradient descent on the
weights by the equation

$$\Delta w_{ij} = -\alpha \frac{\partial E}{\partial w_{ij}}$$

with $\alpha > 0$. This equation, when developed,
provides the correction formula for each weight of the
network.
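
The neuron transformation and the gradient descent on the weights can be sketched for a single neuron. This is a minimal illustration, assuming a hyperbolic tangent as transfer function and a squared error on one example; the input vector, the target and the learning rate `alpha` are hypothetical values.

```python
import numpy as np

def neuron_output(weights, inputs):
    """Weighted sum of the inputs passed through the transfer function (tanh here)."""
    return np.tanh(weights @ inputs)

def gradient_step(weights, inputs, target, alpha=0.1):
    """One update w <- w - alpha * dE/dw for E = (y - target)^2."""
    y = neuron_output(weights, inputs)
    # Chain rule: dE/dw_i = 2 (y - target) * (1 - y^2) * x_i, since tanh' = 1 - tanh^2.
    grad = 2.0 * (y - target) * (1.0 - y ** 2) * inputs
    return weights - alpha * grad

x = np.array([0.5, -1.0, 0.25])     # hypothetical input vector
w = np.zeros(3)                     # initial weights
target = 0.8                        # required output for this example
errors = []
for _ in range(200):
    errors.append(float((neuron_output(w, x) - target) ** 2))
    w = gradient_step(w, x, target)

print("first error:", errors[0], "last error:", errors[-1])
```

In a multilayer network the same chain rule is applied layer by layer, which gives the correction formula for each weight.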

• Fuzzy expert system

A fuzzy expert system is another system model for
decision-making. It has the advantage over a neural
network of giving a form of explanation for its
decisions. It is constituted of a set of calculation
units, the "fuzzy inference rules".

A fuzzy inference rule is an entity of the form
"if <premise> then <conclusion>". The premise is the
part that one seeks to match against the input
data.

In fact, a fuzzy set is a set whose borders are
progressive, contrary to a classic set, which has
defined borders. Thus an element is more or less part
of each fuzzy set. When the data are of dimension 1, a
classic set can be represented by a rectangle
(membership = 1 inside, 0 outside), whereas a fuzzy set
can be a triangle, a trapezium, a Gaussian form, etc.
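
The difference between a classic set and a fuzzy set can be illustrated in dimension 1. The sketch below assumes one possible parameterisation of each shape; in particular the Gaussian form exp(-(x - position)^2 / (2 width^2)) is an assumption, and the numeric bounds are hypothetical.

```python
import math

def crisp(x, lo, hi):
    """Classic set: membership is 1 inside [lo, hi] and 0 outside."""
    return 1.0 if lo <= x <= hi else 0.0

def triangular(x, left, peak, right):
    """Fuzzy set with linear borders rising to 1 at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def gaussian(x, position, width):
    """Fuzzy set with Gaussian borders centred on `position`."""
    return math.exp(-((x - position) ** 2) / (2.0 * width ** 2))

# Hypothetical one-dimensional example: "magnitude is about 2.7".
for magnitude in (2.0, 2.5, 2.7, 3.1):
    print(magnitude,
          crisp(magnitude, 2.5, 2.9),
          round(triangular(magnitude, 2.2, 2.7, 3.2), 2),
          round(gaussian(magnitude, 2.7, 0.3), 2))
```

The classic set switches abruptly between 0 and 1, whereas both fuzzy forms give intermediate degrees of membership near the borders.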

In the same way as above, learning consists of
modifying progressively the parameter values until the
outputs of the fuzzy expert system correspond to the
required outputs.

Four types of parameters are calculated by
learning: the position and width of the fuzzy sets of
premises, the weights of the rules and the degree of
activation of each class for each rule.

In the operational utilisation phase, the fuzzy
expert system provides, besides the class attributed to
a seismic event, the list of rules applicable by
descending order of relevance. Some of these rules can
be in contradiction with the others, which makes it
possible to examine alternative solutions, but it is
the aggregation of the result of all the rules which
provides the overall result.

Therefore, at the disposal of the user, there
exists:

- the initial decision (earthquake, explosion,
rock burst);

- the list of applicable rules;

- the list of rules in contradiction to this
decision;

- the reason for the decision of each rule
(through examination of the coherence between the data
and the corresponding fuzzy sets).

An example of decision rule found by the system is
given below:

if (Time is the middle of the afternoon)
and (Latitude is very close to 43.5°N)
and (Longitude is very close to 5.5°E)
and (Magnitude is about 2.7)
and (Date is preferably Saturday)
then (with level of confidence = 0.8)
(earthquake is improbable)
(explosion is probable)
(rock burst is improbable).
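
The way such a rule responds to an event can be sketched as follows. This is an illustration only, assuming Gaussian premises combined by a product and scaled by the confidence level; the widths, the numeric encoding of the time and of the day of the week, and the function `rule_activation` are hypothetical and are not taken from the learned system.

```python
import math

def gaussian(x, position, width):
    """Gaussian membership, equal to 1 when x coincides with `position`."""
    return math.exp(-((x - position) ** 2) / (2.0 * width ** 2))

# Premises of the rule quoted above as (position, width); every width is made up.
RULE = {
    "premises": {
        "time_h":    (15.0, 1.5),   # "middle of the afternoon"
        "latitude":  (43.5, 0.1),
        "longitude": (5.5, 0.1),
        "magnitude": (2.7, 0.3),
        "weekday":   (5.0, 1.0),    # Monday = 0, so Saturday = 5 (treated as non-cyclic)
    },
    "confidence": 0.8,
}

def rule_activation(rule, event):
    """Product of the premise memberships, scaled by the rule's confidence level."""
    degree = rule["confidence"]
    for name, (position, width) in rule["premises"].items():
        degree *= gaussian(event[name], position, width)
    return degree

matching = {"time_h": 15.2, "latitude": 43.52, "longitude": 5.48,
            "magnitude": 2.6, "weekday": 5}
distant = dict(matching, latitude=45.0)

print(rule_activation(RULE, matching))   # high: every premise is nearly satisfied
print(rule_activation(RULE, distant))    # negligible: the latitude premise fails
```

Ranking the rules by this activation for a given event is one way of producing a list of rules "by descending order of relevance".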

In the invention, for reasons of difficult
convergence, the gradient descent is parameterised by
introducing intermediate variables. If one wishes to
carry out a gradient descent on a parameter $p$ with
$p = \phi(s)$, $\phi$ being a differentiable function, strictly
monotonic, independent from $p$ and from the values of
the examples serving for learning, one has the same
final solutions by carrying out a gradient descent on
$s$. The advantage of such a change of variable is that
it becomes possible to change the way of reaching the
solution, and in particular to facilitate convergence in
difficult cases.
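
The effect of such a change of variable can be made explicit with the chain rule. The following is a sketch, assuming an error $E$ to be minimised and a step size $\alpha > 0$ (both symbols are introduced here for the illustration):

```latex
\frac{\partial E}{\partial s} = \frac{\partial E}{\partial p}\,\phi'(s),
\qquad
\Delta s = -\alpha\,\frac{\partial E}{\partial s}
\;\Longrightarrow\;
\Delta p \approx \phi'(s)\,\Delta s
        = -\alpha\,\phi'(s)^{2}\,\frac{\partial E}{\partial p}.
```

Wherever $\phi'(s) \neq 0$, the derivative with respect to $s$ vanishes exactly where the derivative with respect to $p$ does, so the stationary points are unchanged. Only the effective step on $p$ is rescaled, by the factor $\phi'(s)^2$, which is what modifies the path followed towards the solution.
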
In the invention, the following parameters are
optimised:

(1) The position $y$ of the fuzzy sets of premises:
when the gradient descent is applied directly to this
parameter, one generally obtains difficult convergence.
This is explained by the fact that the variation of the
position $y$ of the fuzzy sets of premises is not an
increasing function of the distance from the example.
This phenomenon is corrected by setting $x = y/\sigma$.



(2) The width $\sigma$ of the fuzzy sets of premises:
when the data are structured in sets of very different
sizes, the algorithm cannot converge. By studying the
relative variation $\Delta\sigma/\sigma$, one discovers that it is
not bounded (that is to say that nothing prevents it
tending towards infinite values). When the data are
tightly grouped, this variation does in actual fact take
very high values. In order to have a lower relative
modification when the data are close, one sets
$s = \ln(2\sigma^2)$.

(3) The weights $p$ of the rules: this is the most
difficult parameter to set. With a direct gradient
descent, the lowest weights diminish and become
negative, which makes them lose all significance and
makes the algorithm diverge. Thus one chooses a
function for positive activation by imposing a
supplementary restriction: for different examples with
the same activation level of the rule, the variation of
this level must be the same if the conclusions are
equal. The consequence is that the relative variation
of the weights of the rules must be constant when the
examples have the same degree of belonging to the fuzzy
sets. This is carried out by setting $r = \ln(p)$.

(4) The degree of activation $d$ of each class for
each rule.

The gradient descent is thus carried out not on
$y$, $\sigma$ and $p$, but on:

• $x = y/\sigma$

• $s = \ln(2\sigma^2)$

• $r = \ln(p)$

For $d$, one does not carry out any change of variable.
These changes of variables ensure very good quality of
convergence and allow very efficient fuzzy expert
systems to be obtained.
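
The three changes of variables, and the reason why the logarithmic form protects the rule weights, can be sketched as follows. This is an illustration under stated assumptions: the toy error $(p - 0.01)^2$, the step size and the function names are hypothetical and do not reproduce the error actually minimised by the system.

```python
import math

def to_internal(y, sigma, p):
    """Map (position, width, weight) to the optimised variables (x, s, r)."""
    return y / sigma, math.log(2.0 * sigma ** 2), math.log(p)

def from_internal(x, s, r):
    """Inverse map: the width and the weight are positive by construction."""
    sigma = math.sqrt(math.exp(s) / 2.0)
    return x * sigma, sigma, math.exp(r)

def toy_loss_grad(p, best=0.01):
    """Gradient of the toy error E(p) = (p - best)^2 (illustration only)."""
    return 2.0 * (p - best)

alpha, steps = 0.6, 50
p_direct, r = 1.0, math.log(1.0)
direct_path, log_path = [], []
for _ in range(steps):
    p_direct -= alpha * toy_loss_grad(p_direct)     # descent on the weight itself
    direct_path.append(p_direct)
    p = math.exp(r)
    r -= alpha * toy_loss_grad(p) * p               # chain rule: dE/dr = dE/dp * p
    log_path.append(math.exp(r))

print("lowest weight, direct descent:", min(direct_path))
print("lowest weight, descent on r  :", min(log_path))
```

With this step size the direct descent drives the weight below zero, whereas the weight recovered from $r$ stays strictly positive whatever the step, at the price of a slower approach to small values.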

• Base of examples and validation

The base of examples used must satisfy two
fundamental principles:

- to be qualitatively representative of the real
problem (distribution of examples in conformity with
the real distribution);

- to be quantitatively representative of the
problem (number of examples sufficient to constitute a
satisfactory sampling).

There are several methodologies for learning and
validation. In the simplest procedure, one divides the
base of examples into two disjoint bases: the learning
base and the test base. One trains the system by
learning on the first and one verifies its correct
operation on the second. A base of examples that does
not jointly satisfy the two properties mentioned
above runs the risk of leading to a system incapable of
generalising correctly, that is to say of operating on
new examples, not presented during the learning stage.
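
The simplest procedure described above can be sketched as follows. This is a minimal illustration, assuming a random split with a fixed seed and a test fraction of 30 %; the function `split_base`, the fraction and the placeholder event identifiers are hypothetical.

```python
import random

def split_base(examples, test_fraction=0.3, seed=0):
    """Shuffle a copy of the base and cut it into disjoint learning and test bases."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(test_fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]

events = [f"event_{i}" for i in range(1000)]     # placeholder identifiers
learning_base, test_base = split_base(events)
print(len(learning_base), "learning examples,", len(test_base), "test examples")
```

A purely random split preserves the class distribution only on average; splitting each class separately would be one way of keeping both bases qualitatively representative.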

Whether it concerns artificial neuron networks,
fuzzy expert systems or more generally any system
conceived by statistical learning on experimental data,
it is of prime importance to use a base of examples
which is adequate in quality and in quantity, and to
validate the system produced by rigorous procedures.

Detailed description 

• Description of a seismic event 
A seismic event to be identified can be described
by:

- the signal data coming from the network of
seismic stations, or

- highest level properties, measurable directly or
calculated by geophysical models. For example, one can
use the location of the event (latitude and longitude),
its magnitude and the moment it occurred (time and day
of the week). An example of such high level information
is as follows: on Thursday 7 April 1966 at 12 hr, an
earthquake of magnitude 1.4 occurred at longitude
02°35'06" East and latitude 49°12'25" North.

• The neuro-fuzzy classifier

The neuro-fuzzy classifier (NFC expert), as shown
in figure 8, is constituted of a neuro-fuzzy coding of
the data followed by a multilayer perceptron. It is
applicable to high level data.

Neuro-fuzzy coding consists of associating several
coding cells to each input variable (or set of input
variables), each cell having a region of influence
modelled by a function defining its activation
mechanism. The presentation of a vector of values then
generates an activation diagram for the associated
coding cells.

Figure 9 shows this mechanism for activating
coding cells. Presentation of a value generates an
activation diagram corresponding to the impulse
response of each activation function to the value
presented. The levels of grey attributed to the centres
of the cells indicate their activation level, comprised
between 0 and 1 (black: 1, white: 0).

Figure 10 shows examples of activation diagrams
generated by presentation of typical values. It
concerns a coding of cursor type. The low values (or
high respectively) preferentially activate the left
cells (or right respectively).
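The cursor-type coding can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of document reference [8]: the Gaussian shape of the region of influence, the common `width`, the five cell positions and the function name `cursor_coding` are all hypothetical.

```python
import math


def cursor_coding(value, centres, width):
    """Return the activation (between 0 and 1) of each coding cell for one value.

    Assumption: each cell has a Gaussian region of influence centred on
    centres[i], with a common width; the source does not fix this shape.
    """
    return [math.exp(-((value - c) ** 2) / (2.0 * width ** 2)) for c in centres]


# Five hypothetical cells spread evenly over a normalised input range [0, 1].
centres = [0.0, 0.25, 0.5, 0.75, 1.0]
low = cursor_coding(0.1, centres, width=0.2)    # low value: left cells dominate
high = cursor_coding(0.9, centres, width=0.2)   # high value: right cells dominate
```

Each list plays the role of one activation diagram: several neighbouring cells respond at once, with a strength that falls off with distance from the presented value.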

The advantages of this coding are several:

- by its very nature, it makes it possible to
represent incomplete, imprecise or uncertain data and
to use them efficiently for decision-making;

- by its non-linear processing of the data,
it facilitates later processing (here, the
classification).

This neuro-fuzzy coding is carried out in several
successive stages: definition of sub-groups of
characteristics, choice and placing of coding cells
assigned to each group, definition of parameters for
the region of influence of each cell. The details of
this procedure are explained in document reference [8].

Once the data have been coded, they are analysed
by the multilayer perceptron which then calculates the
class.

• The fuzzy expert system 

In an embodiment the system according to the
invention comprises a single processing branch based on
such a fuzzy expert system.

The fuzzy expert system (FES expert) is also 
applicable to high level data. 

In figure 11, a fuzzy expert system with five
rules is shown (one rule per line).

For each line, the five columns on the left
represent the premises and correspond to five inputs:
time, latitude, longitude, magnitude and date. The
premises are composed of fuzzy sets in Gaussian form
which cover the domains of the input variable leading
to reinforcement of the activity of the rule.

Four types of parameters are calculated by
learning: the position and the width of the fuzzy sets
of premises (columns 1 to 5), the weights of the rules
(column 6), which make it possible to specify the
degree of importance of each rule in the decision
process, and the degree of activation of each class
(natural earthquake, explosion or rock burst) for each
rule (columns 7 to 9).

Each time an example to be classified is
presented, a calculation is made of the contents of
column 10 and line 6:

• Column 10 gives the activation of each rule (and
thus enables estimation of its fit with the case being
processed).

• Line 6 is the synthesis of the decisions from
the five rules and gives the overall response of the
fuzzy expert system (here, the decision is
"explosion"). This synthesis is made by calculating the
barycentre of the decisions from all the rules (columns
7 to 9) weighted by the corresponding activation level
(column 10). In figure 11, the position of the
barycentre for each class is symbolised by a vertical
trace on line 6, columns 7 to 9.
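The barycentre calculation can be illustrated by the following sketch. It assumes, beyond what the text states, that the premises of a rule are combined by a product of Gaussian memberships and that the rule weight multiplies the result; the dictionary keys and function names are hypothetical, and the example uses two inputs rather than five for brevity.

```python
import math


def rule_activation(inputs, centres, widths, weight):
    """Activation of one rule (the content of column 10).

    Assumptions: product of Gaussian premise memberships, multiplied by
    the rule weight. The source does not spell out this combination.
    """
    membership = 1.0
    for x, y, a in zip(inputs, centres, widths):
        membership *= math.exp(-((x - y) ** 2) / (2.0 * a ** 2))
    return weight * membership


def fuzzy_decision(inputs, rules):
    """Barycentre of the rule decisions weighted by the rule activations.

    Each rule is a dict with hypothetical keys 'centres', 'widths',
    'weight' and 'classes' (one degree of activation per class).
    """
    activations = [rule_activation(inputs, r["centres"], r["widths"], r["weight"])
                   for r in rules]
    total = sum(activations)
    n_classes = len(rules[0]["classes"])
    if total == 0.0:
        # No rule fires at all: return a neutral response.
        return activations, [0.0] * n_classes
    response = [
        sum(act * r["classes"][k] for act, r in zip(activations, rules)) / total
        for k in range(n_classes)
    ]
    return activations, response


# Two hypothetical rules on two inputs; classes are (earthquake, explosion, rock burst).
rules = [
    {"centres": [0.0, 0.0], "widths": [1.0, 1.0], "weight": 1.0, "classes": [1.0, -1.0, -1.0]},
    {"centres": [10.0, 10.0], "widths": [1.0, 1.0], "weight": 1.0, "classes": [-1.0, 1.0, -1.0]},
]
activations, response = fuzzy_decision([0.0, 0.0], rules)
```

Because the activations are returned alongside the response, the rules can be ranked by how well they fit the case, which is what allows the system to explain its decision.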
Learning is carried out in two stages:

- a first phase consists of positioning the fuzzy
sets (centres and widths), for example by means of an
algorithm called fuzzy C-means, as described in
document reference [9];

- a second phase consists of carrying out a gradient
descent on the four types of parameters.
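The first phase can be sketched with a plain one-dimensional fuzzy C-means. This is a textbook version given for illustration, not the fast variant of document reference [9]; the fuzzifier value of 2, the evenly spread starting centres and the use of a membership-weighted standard deviation as the width are assumptions, and the function names are hypothetical.

```python
def _memberships(values, centres, power):
    """Membership of every value in every cluster (each row sums to 1)."""
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centres]
        if 0.0 in dists:
            # The value sits exactly on one or more centres.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            rows.append([1.0 / sum((d_i / d_j) ** power for d_j in dists)
                         for d_i in dists])
    return rows


def fuzzy_c_means_1d(values, n_clusters, fuzzifier=2.0, n_iter=50):
    """Return (centres, widths) for n_clusters >= 2 fuzzy sets on one variable."""
    lo, hi = min(values), max(values)
    # Deterministic start (assumption): centres spread evenly over the range.
    centres = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    power = 2.0 / (fuzzifier - 1.0)
    for _ in range(n_iter):
        rows = _memberships(values, centres, power)
        for i in range(n_clusters):
            w = [row[i] ** fuzzifier for row in rows]
            centres[i] = sum(wk * x for wk, x in zip(w, values)) / sum(w)
    # Width (assumption): membership-weighted standard deviation.
    rows = _memberships(values, centres, power)
    widths = []
    for i in range(n_clusters):
        w = [row[i] ** fuzzifier for row in rows]
        var = sum(wk * (x - centres[i]) ** 2 for wk, x in zip(w, values)) / sum(w)
        widths.append(var ** 0.5)
    return centres, widths


# Hypothetical magnitudes forming two groups.
centres, widths = fuzzy_c_means_1d([0.9, 1.0, 1.1, 4.9, 5.0, 5.1], n_clusters=2)
```

The centres and widths found in this way would then serve as starting values for the gradient descent of the second phase.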

• Neural network with local connections and shared weights

Contrary to the two preceding branches, the neural
network with local connections and shared weights (TDNN
expert) allows input of the seismic signals themselves
and learns not only the decision
procedure, but also the discriminant parameters which
will serve as a basis for this decision. This neural
network is of the multilayer perceptron type with local
connections and shared weights, taking as input the
pre-processed spectrograms of seismic signals, as described
in document reference [10]. These spectrograms are
obtained by applying a sliding window Fourier transform
to the signal.

Figures 12A to 12D show the successive
pre-processing steps applied to each seismic signal, resulting in
a final spectrogram with 15 frequency bands: figure 12A
shows the initial signal; figure 12B shows the
spectrogram deduced from the signal with 50 frequency
bands; figure 12C shows the "noise-suppressed"
spectrogram; figure 12D shows the spectrogram after
reduction from 50 to 15 frequency bands.

The spectrogram obtained is then pre-processed and
next presented as input to a neural network of the
TDNN type. Each network is specialised in the treatment
of signals recorded by a given station.



Figure 13 shows the architecture of a TDNN network
specialised in classification of spectrograms deduced
from signals recorded by a given seismometer, this
network comprising four neuron layers. The input layer
has local connections and shared weights (4 frames with
a delay of 2 frames) with the first hidden layer. The
latter also has local connections with shared weights
(9 frames with a delay of 5 frames) with the second
hidden layer, which is fully connected to the last layer.
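A layer with local connections and shared weights can be sketched as below. The sketch reads "4 frames with a delay of 2 frames" as a window of 4 frames moved 2 frames at a time, which is an interpretation rather than something the text states; the `tanh` non-linearity, the input length of 70 frames and all names are likewise hypothetical.

```python
import math


def output_frames(n_input_frames, window, delay):
    """Number of frames produced by a window moved `delay` frames at a time."""
    return (n_input_frames - window) // delay + 1


def shared_weight_layer(frames, weights, bias, window, delay):
    """Apply the same weights at every window position (time-delay layer).

    frames  : list of input frames, each a list of feature values
    weights : weights[unit][t][f] for t in range(window), shared over time
    bias    : one bias per output unit
    """
    outputs = []
    for start in range(0, len(frames) - window + 1, delay):
        patch = frames[start:start + window]
        out_frame = []
        for w_unit, b in zip(weights, bias):
            total = b + sum(w_unit[t][f] * patch[t][f]
                            for t in range(window)
                            for f in range(len(patch[t])))
            out_frame.append(math.tanh(total))  # assumed non-linearity
        outputs.append(out_frame)
    return outputs


# Hypothetical input of 70 frames with 15 frequency bands each.
n_hidden1 = output_frames(70, window=4, delay=2)
n_hidden2 = output_frames(n_hidden1, window=9, delay=5)

# One output unit with all-zero weights, applied to 10 frames of 15 bands.
frames = [[0.5] * 15 for _ in range(10)]
weights = [[[0.0] * 15 for _ in range(4)]]
hidden = shared_weight_layer(frames, weights, bias=[0.0], window=4, delay=2)
```

Since the same weights are reused at every position, a feature shifted by a few frames produces the same response a few frames later.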

The shared weights make the architecture more robust
to small differences in phase recordings or to missing or
erroneous frames. However, because of the speed of
propagation of the P waves (compression) and S waves
(shearing), the time between the arrival of phase P and
phase S varies as a function of the distance between the
recording station and the epicentre of the event, which
complicates learning. The solution adopted consists of
aligning the recording of phase P on the 10th frame and
that of phase S on the 60th frame.
• Final decision-making

For the final decision-making, it is assumed that
all the outputs lie within the real
interval [-1, 1]. This decision-making consists of an
association of the answers provided by the three branches
in order to increase reliability. It can be carried out
by calculating the arithmetic mean of the
homologous outputs of each of the three branches. For
each of the three outputs $S_i$ of the overall system one
then has:

$$S_i = \frac{1}{3} \sum_{j=1}^{3} S_i^{(j)}$$

where $S_i^{(j)}$ denotes output $i$ of branch $j$.

The certitude of the answer is evaluated by a
coefficient $K$, calculable only if the system is in a
situation for decision-making (that is to say if there
is one and only one strictly positive output). This
coefficient is then equal to the average of the
absolute values of the outputs:

$$K = \frac{1}{3} \sum_{i=1}^{3} \left| S_i \right|$$

K < 0.2           doubt
K ∈ ]0.2, 0.4]    caution
K ∈ ]0.4, 0.6]    reasonable certainty
K ∈ ]0.6, 0.8]    high certainty
K > 0.8           almost absolute certainty


Thus, for example, one can obtain:

System   Class chosen          Details of responses   Degree of certainty
                               per class
1        Class 3               (-0.9 -0.4 +0.8)       0.7: high certainty
2        Undetermined 1 or 3   (+0.1 -0.5 +0.3)       Complete uncertainty
3        Class 1               (+0.2 -0.6 -0.2)       0.3: caution
Fusion   Class 3               (-0.2 -0.5 +0.3)       0.3: caution
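The example above can be reproduced with a short sketch of the fusion by averaging and of the certainty coefficient. The function names are hypothetical, and treating a value of exactly 0.2 as "caution" is an assumption, since the intervals given in the text leave that value unassigned.

```python
def fuse(branch_outputs):
    """Average the homologous outputs of the branches."""
    n = len(branch_outputs)
    return [sum(column) / n for column in zip(*branch_outputs)]


def certainty_label(k):
    """Map the coefficient K to the verbal scale given in the text."""
    # K exactly equal to 0.2 is not covered by the text; it is treated
    # here as "caution" (assumption).
    if k < 0.2:
        return "doubt"
    if k <= 0.4:
        return "caution"
    if k <= 0.6:
        return "reasonable certainty"
    if k <= 0.8:
        return "high certainty"
    return "almost absolute certainty"


def decide(outputs):
    """Return (class number, K, label), or (None, None, 'undetermined').

    A decision exists only if exactly one output is strictly positive.
    """
    positive = [i for i, s in enumerate(outputs) if s > 0]
    if len(positive) != 1:
        return None, None, "undetermined"
    k = sum(abs(s) for s in outputs) / len(outputs)
    return positive[0] + 1, k, certainty_label(k)


# Outputs of the three branches taken from the example table.
branches = [
    [-0.9, -0.4, +0.8],
    [+0.1, -0.5, +0.3],
    [+0.2, -0.6, -0.2],
]
fused = fuse(branches)
result = decide(fused)
```

With these values the fused outputs are close to (-0.2, -0.5, +0.3) and `decide` selects class 3 with K of about 0.33, that is "caution", in agreement with the last line of the table.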



Example of implementation of the invention:
discrimination of regional seismic events
• Localisation of the event 

The discrimination "natural event / artificial
event" is a major step in seismic surveillance, carried
out rapidly from reading the signals during the
reduction stage, and then improved with each new
processing. It is estimated that nearly seven years are
needed for analysts to become completely operational.
Since these analysts are real experts, it is difficult
to describe their reasoning method clearly, because it is
based both on expert know-how and on case-by-case
reasoning.

The location of a seismic event is obtained after
a succession of exchanges between two principal stages:

- the recording of the different seismic phases
carried out on signals registered by stations detecting
the event and the calculation of the magnitude;

- the localisation itself, carried out by a
mathematical model created by seismologists.

After the recording stage, the localisation of the
event can be carried out using simulation software
known to those skilled in the art. It uses
seismological models containing information about the
speed of waves, the different types of waves and their
propagation mode, altitude corrections according to
stations etc. Several localisation hypotheses are
proposed, each associated with a degree of data consistency.
If they do not satisfy the expert, he modifies his
phase recordings and then restarts a localisation
search. This cycle is repeated until a result
considered to be satisfactory is obtained.

The quality of the localisation depends on the 
number and quality of the stations used for localising 
the event, together with their azimuth distribution. In 
general, events located in France are better localised 
than events abroad. To improve localisation in the 
latter case, experts regularly consult data from 
abroad. With French data alone, the precision of 
localisation of events in France is on average five 
kilometres. In the best of cases, it is estimated at 
about one kilometre. 

Table II at the end of the description gives an
example of the results provided by the localisation
procedure. The upper part summarises the results obtained:
each value of time of origin, of magnitude and of
localisation (latitude-longitude) is associated with an
estimated level of inaccuracy. The lower part names the
stations taken into account for the localisation and
the mean square errors (MSE) obtained as a function of
the hypothesis of epicentre depth. In the case of rock
bursts, the depth is set arbitrarily at one
kilometre.

• Characterisation of the event
Here we are concerned with local and regional
seismic events, meaning those in mainland France or
adjacent regions. These events are often described as
close, as opposed to distant teleseisms (epicentre
situated several thousand kilometres from the
sensor).

Three types of seismic events have to be
discriminated:

- earthquakes, seismic events of natural origin;
- terrestrial explosions (blasting in mines,
quarries, work-sites etc.) or explosions in the sea (bomb
disposal, weapons testing etc.);
- rock bursts corresponding to the collapse of a
mining layer and associated with the operation of the
mine.

Analysis of the state of the art has demonstrated
the failure of approaches based on discrimination
relying on seismic signals alone. Thus one needs to use
all the available data by adopting an approach based on
multi-expert and multi-source merging. The concept of
the system of automatic discrimination of seismic
events is based on three modules:

- the first two (NFC and FES experts) are modules
carrying out discrimination from high level data only,
deduced by the inverse model of the LDG laboratory.
Thus, at this level, no seismic signal is taken into
account directly;

- the third (TDNN expert) is based on the analysis
of the seismic signals.
• Data used

geographic spread of events:

The seismic events to be analysed are spread over
the whole of French mainland territory and an adjacent
perimeter. The epicentres of the events recorded by the
LDG laboratory between 1962 and 1996 are shown in
figure 14.

high level data:

Each seismic event is characterised by the
following information: the date and time of origin of
the event, the latitude and longitude of the epicentre,
and its magnitude.

The time and the date are stored because of the
rules about quarry or mine blasting in France,
forbidding blasting at night or during weekends or
public holidays. Nonetheless, permission is given for
certain work-sites, for example to avoid disturbing
traffic.

The magnitude is recorded since, according to
seismologists, rock bursts reach a typical magnitude
(about 3). Furthermore, only earthquakes can produce
greater magnitudes. Several values of magnitude are
taken into account when they are available.

The localisation of the epicentre, characterised
by its latitude and its longitude, is also a major
characteristic. However, there are certain mines
located in regions of high seismicity and capable of
provoking rock bursts.
low level data: 

Low level data are signals arriving from the 42
seismic stations of the LDG laboratory (see figure 1).
The pre-processing relates essentially to the creation
of spectrograms of the seismic signals, which are
non-stationary. These spectrograms are obtained by
applying a sliding window Fourier transform to
the signal. To begin with, the signal, sampled at 50 Hz,
is segmented into two-second frames shifted by one
second and weighted by a Hamming window. Next, the spectral
energy density is calculated for 50 frequency bands,
eliminating the continuous component. Then a
logarithmic transformation is applied with noise
suppression, assuming a logarithmic model, in
each band according to the equation
$\max(\ln(1+x) - \mu(\mathrm{noise}) - \sigma(\mathrm{noise}),\ 0)$,
where $\mu(\mathrm{noise})$ and $\sigma(\mathrm{noise})$
correspond to the average and the standard deviation of the
noise estimated over a period preceding the recording
of the P wave. Finally, the number of frequency bands
is reduced from 50 to 15 through a pseudo-logarithmic
compression of high frequencies.
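The first steps of this pre-processing can be sketched as follows. Several points are assumptions and not taken from the source: the noise suppression formula is read as max(ln(1+x) - mean - standard deviation, 0), the noise statistics are computed on the log-transformed noise frames, the 50 bands are taken to be bins 1 to 50 of a 100-point discrete Fourier transform, and the function names are hypothetical. The final reduction from 50 to 15 bands is omitted because the text does not give the band mapping.

```python
import cmath
import math

SAMPLE_RATE = 50               # Hz, from the text
FRAME_LEN = 2 * SAMPLE_RATE    # two-second frames: 100 samples
HOP = SAMPLE_RATE              # frames shifted by one second
N_BANDS = 50                   # bins 1..50; bin 0 (continuous component) dropped


def hamming(n):
    """Hamming window of n points."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal):
    """Spectral energy of each frame in 50 bands (plain DFT, written for clarity)."""
    window = hamming(FRAME_LEN)
    frames = []
    for start in range(0, len(signal) - FRAME_LEN + 1, HOP):
        seg = [s * w for s, w in zip(signal[start:start + FRAME_LEN], window)]
        bands = []
        for k in range(1, N_BANDS + 1):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / FRAME_LEN)
                        for n, x in enumerate(seg))
            bands.append(abs(coeff) ** 2)
        frames.append(bands)
    return frames


def suppress_noise(frames, n_noise_frames):
    """Log transform, then subtract per-band noise mean and standard deviation.

    Assumption: the first n_noise_frames frames precede the P wave and the
    statistics are taken on their log-transformed values.
    """
    logs = [[math.log1p(x) for x in frame] for frame in frames]
    noise = logs[:n_noise_frames]
    n_bands = len(logs[0])
    mean = [sum(f[b] for f in noise) / len(noise) for b in range(n_bands)]
    std = [math.sqrt(sum((f[b] - mean[b]) ** 2 for f in noise) / len(noise))
           for b in range(n_bands)]
    return [[max(v - mean[b] - std[b], 0.0) for b, v in enumerate(frame)]
            for frame in logs]


# Synthetic 10-second record: weak background, then a 5 Hz wave from sample 250.
signal = [0.01 * math.sin(12.9898 * n)
          + (math.sin(2.0 * math.pi * 5.0 * n / SAMPLE_RATE) if n >= 250 else 0.0)
          for n in range(500)]
raw = spectrogram(signal)
clean = suppress_noise(raw, n_noise_frames=3)
```

With a 100-sample frame at 50 Hz each band is 0.5 Hz wide, so the 5 Hz wave falls in the tenth band of the frames that contain it.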

• Results obtained
The system described above classifies French
regional seismic events with the following level of
performance:

- 86% for earthquakes;
- 91% for explosions;
- greater than 99% for rock bursts.

The overall level of performance is about 90%.
TABLE I

[The rows of this table are illegible in the source; only the
column legend is recoverable.]

TIME OR: time of origin (UT)
LAT: latitude of epicentre (deg)
LON: longitude of epicentre (deg)
DEP: depth of epicentre (km)
LM: local magnitude
MSE: mean square error (s)
TABLE II



[Localisation bulletin for an event labelled "rock burst"; the
numerical values are illegible in the source. Legible field names:
number of stations used, time of origin, latitude, depth (set),
95% confidence ellipse (1/2 long axis, 1/2 short axis, azimuth of
long axis), and a per-station listing of phases with arrival time,
distance and residuals (res).]


REFERENCES 

[1] D. R. Baumgardt and K. A. Ziegler, "Spectral
evidence for source multiplicity in explosions:
application to regional discrimination of earthquake
and explosion" (Bulletin of Seismological Society of
America, vol. 78, pp. 1773-1795, 1988).

[2] P. S. Dysart and J. J. Pulli, "Regional
seismic event classification at the NORESS array:
seismological measurement and the use of trained neural
networks" (Bulletin of Seismological Society of
America, vol. 80, pp. 1910-1933, 1990).

[3] P. W. Pomeroy, W. J. Best and T. V. McEvilly,
"Test ban treaty verification with regional data: a
review" (Bulletin of Seismological Society of America,
vol. 72, n° 6, pp. S89-S129, 1982).

[4] M. Musil and A. Plesinger, "Discrimination
between local microearthquakes and quarry blasts by
multilayer perceptrons and Kohonen maps" (Bulletin of
Seismological Society of America, vol. 86, n° 4, pp.
1077-1090, 1996).

[5] S. R. Taylor, "Discrimination between nuclear
explosions and earthquakes" (Energy and Earth Sciences,
pp. 56-57, 1990).

[6] F. U. Dowla, S. R. Taylor and R. W. Anderson,
"Seismic discrimination with artificial neural
networks: preliminary results with regional spectral
data" (Bulletin of Seismological Society of America,
vol. 80, n° 5, pp. 1346-1373, 1990).

[7] M. Nicolas, J.-P. Santoire and P.-Y. Delpech,
"Intraplate seismicity: new seismotectonic data in
western Europe" (Tectonophysics, n° 179, pp. 27-53,
1990).

[8] S. Muller, P. Garda, Muller, Y. Cansi,
"Seismic events discrimination by neuro-fuzzy merging
of signal and catalogue features" (Physics and Chemistry of
the Earth (A), vol. 24, n° 3, pp. 201-206, 1999).

[9] B. T. W. Cheng, D. B. Goldgof, L. O. Hall,
"Fast fuzzy clustering" (Fuzzy Sets and Systems 93,
49-56, 1998).

[10] A. Klaassen, X. Driancourt, S. Muller, J.-D.
Muller, "Classifying regional seismic signals using
TDNN-alike neural networks" (International Conference
on Artificial Neural Networks '98, Skövde, Sweden, 2-4
September 1998).


Amended claims to file when entering the National Phase 



CLAIMS 

1. System of artificial intelligence for
classification of events, objects or situations from
signals and from discriminant parameters produced by
models, comprising at least one processing branch
comprising a fuzzy expert system taking a decision
according to high level properties and discriminant
parameters of lower level extracted from signals by
signal processing type procedures, and capable of
explaining its decision to the user through the
intermediary of rules selected by order of
applicability.

2. System according to claim 1 in which, in the
fuzzy expert system, a gradient descent is carried out
on the parameters:

• $X = y/a$

• $s = \ln(2a^2)$

• $r = \ln(p)$

• $d$

with:

• y: position of fuzzy sets of premises

• a: width of fuzzy sets of premises

• p: weights of rules

• d: degree of activation of each class for
each rule.

3. System according to claim 1, which is a
multi-expert system constituted of at least two independent
processing branches, organising themselves
automatically through statistical learning on data
bases, having particular properties and merged by a
high level decisional system.

4. System according to claim 3, in which one 
branch comprises a neuro-fuzzy classifier taking its 
decisions based on high level properties and lower 
level discriminant parameters extracted from signals by 
signal processing type procedures. 

5. System according to claim 3, in which one 
branch comprises a neural network with local 
connections and shared weights constituted of banks of 
non-linear adaptable filters, itself extracting 
discriminant information for time-frequency 
representation of the corresponding signals. 

6. System according to claim 1, which is a system 
for classification of geophysical events. 

7. System according to claim 6, in which the high 
level properties are the localisation, the magnitude, 
the time and the date. 